Final Project

dongnili&ziliangsong

A snapshot of the data set

id   salinity  do    ph   secchi_depth  water_depth  water_temp  air_temp
Bay  1.3       11.7  7.3  0.4           0.40         5.9         8.0
Bay  1.5       12.0  7.4  0.2           0.35         3.0         2.6

  • id: Location of the monitoring site: Pool A, B, C, or D, or the Bay.
  • salinity: Salinity of the water, in parts per thousand (ppt).
  • do: Dissolved oxygen concentration, in milligrams per liter (mg/L).
  • ph: pH, a measure of the acidity or alkalinity of the water.
  • secchi_depth: Visibility depth, measured with a Secchi disc, in meters.
  • water_depth: Water depth, in meters.
  • water_temp: Water temperature, in degrees Celsius (°C).
  • air_temp: Air temperature, in degrees Celsius (°C).

Exploratory summaries

Statistical summary

Summary Statistics for Salinity and pH

Variable  Min  Q1   Median  Mean       Q3   Max  Missing_Values
Salinity  0.0  0.0  0       0.9230227  1.5  9.0  0
pH        0.3  6.5  7       7.2115530  7.5  9.9  0

Summary Statistics for Location (id)

id   Count  Percentage
A    160    12.12
B    170    12.88
Bay  680    51.52
C    107    8.11
D    203    15.38
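
The percentages in the location table follow directly from the counts. A minimal R sketch, using only the counts shown above (the object names `counts` and `site_summary` are illustrative, not from the original analysis):

```r
# Site counts copied from the location table above.
counts <- c(A = 160, B = 170, Bay = 680, C = 107, D = 203)

# Share of all observations per site, in percent, rounded to two decimals.
percentage <- round(100 * counts / sum(counts), 2)

site_summary <- data.frame(
  id = names(counts),
  Count = unname(counts),
  Percentage = unname(percentage)
)
print(site_summary)

# With the raw data, the same table could be built from the id column,
# e.g. table(water_nona$id) and prop.table(table(water_nona$id)).
```

The counts sum to 1320 observations, so the Bay alone contributes just over half of the data.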

Exploratory summaries

Exploratory Charts

Exploratory summaries

pH variation over Salinity

Exploratory summaries(1)

pH vs. Salinity (Colored by Location)

Fitted Model Summary(1)

Model 1: Linear Relationship

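Model 1 equation: \(\text{pH} = \alpha + \beta_1 \cdot \text{Salinity} + \beta_2 \cdot \text{Location}\), where Location is a categorical variable with Pool A as the reference level, so \(\beta_2\) stands for one coefficient per remaining site.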


Call:
lm(formula = ph ~ salinity + id, data = water_nona)

Coefficients:
(Intercept)     salinity          idB        idBay          idC          idD  
     6.8875       0.1426       0.3573       0.2795       0.3600      -0.1745  
  • Salinity: Each additional 1 ppt of salinity is associated with a 0.1426-unit increase in pH, holding location fixed.
  • Pool B: pH is 0.3573 units higher than in Pool A, adjusted for salinity.
  • Bay: pH is 0.2795 units higher than in Pool A, adjusted for salinity.
  • Pool C: pH is 0.3600 units higher than in Pool A, adjusted for salinity.
  • Pool D: pH is 0.1745 units lower than in Pool A, adjusted for salinity.
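
To make the coefficient interpretation concrete, the fitted values can be computed by hand from the printed coefficients. This is a minimal sketch; `predict_ph` is a hypothetical helper written for illustration, and the salinity value 1.5 ppt is just an example input.

```r
# Coefficients copied from the lm() output above.
coefs <- c("(Intercept)" = 6.8875, salinity = 0.1426,
           idB = 0.3573, idBay = 0.2795, idC = 0.3600, idD = -0.1745)

# Hypothetical helper: predicted pH for a given salinity (ppt) and site.
predict_ph <- function(salinity, site) {
  # Pool A is the reference level, so it has no site offset.
  offset <- if (site == "A") 0 else coefs[[paste0("id", site)]]
  coefs[["(Intercept)"]] + coefs[["salinity"]] * salinity + offset
}

ph_bay <- predict_ph(1.5, "Bay")
ph_a   <- predict_ph(1.5, "A")

# At equal salinity, the gap between two sites is exactly the site coefficient.
ph_bay - ph_a
```

With the fitted model object available, `predict()` on new data would give the same values without hard-coding the coefficients.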

Model 1 Residual Diagnostics

Model 2: Transforming the Response Variable

Model 2 equation: \(\sqrt{\text{pH}} = \alpha + \beta_1 \cdot \text{Salinity} + \beta_2 \cdot \text{Location}\)
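
Because Model 2 is fitted on the square-root scale, its fitted values have to be squared to return to pH units. A short worked form of that back-transformation, using the same symbols as the model equation (the hats denote estimates; this is a sketch of the usual approach, not something stated in the original slides):

```latex
\[
\widehat{\sqrt{\text{pH}}} = \hat{\alpha} + \hat{\beta}_1 \cdot \text{Salinity} + \hat{\beta}_2 \cdot \text{Location}
\quad\Longrightarrow\quad
\widehat{\text{pH}} = \left(\hat{\alpha} + \hat{\beta}_1 \cdot \text{Salinity} + \hat{\beta}_2 \cdot \text{Location}\right)^2
\]
```

One consequence is that the coefficients of Model 2 are no longer constant shifts in pH, so they are not directly comparable with those of Model 1.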

Exploratory summaries(2)

Exploratory Charts

Correlation Matrix Heatmap

Fitted Model Summary(2)

Secchi Depth Model 1: Linear Relationship

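\(\text{Secchi Depth} = \alpha + \beta_1 \cdot \text{Salinity} + \beta_2 \cdot \text{Dissolved Oxygen} + \beta_3 \cdot \text{pH} + \beta_4 \cdot \text{Water Depth} + \beta_5 \cdot \text{Water Temp} + \beta_6 \cdot \text{Air Temp}\)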

Start:  AIC=-3703.08
secchi_depth ~ salinity + do + ph + water_depth + water_temp + 
    air_temp

              Df Sum of Sq     RSS     AIC
<none>                      78.998 -3703.1
- water_temp   1     0.254  79.252 -3700.8
- air_temp     1     0.431  79.428 -3697.9
- ph           1     0.577  79.575 -3695.5
- do           1     0.629  79.627 -3694.6
- salinity     1     1.751  80.749 -3676.1
- water_depth  1   213.016 292.014 -1979.3
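
The AIC column can be reproduced from the RSS column. For linear models, `step()` relies on `extractAIC()`, which computes n * log(RSS / n) + 2 * edf, where edf is the number of estimated coefficients. The sketch below assumes n = 1320 (the total from the location table, i.e. that no rows were dropped before fitting); `step_aic` is a hypothetical helper name.

```r
# AIC as reported by step() for a linear model:
# n * log(RSS / n) + 2 * edf, with edf = number of estimated coefficients.
step_aic <- function(rss, n, edf) {
  n * log(rss / n) + 2 * edf
}

n <- 1320  # assumption: every row counted in the location table enters the fit

full           <- step_aic(78.998,  n, edf = 7)  # intercept + 6 slopes
no_water_temp  <- step_aic(79.252,  n, edf = 6)  # one slope removed
no_water_depth <- step_aic(292.014, n, edf = 6)

round(c(full = full,
        no_water_temp = no_water_temp,
        no_water_depth = no_water_depth), 1)
```

Under that assumption the three values agree with the table to rounding. Every row has a higher AIC than `<none>`, so no term is removed at this step, and dropping water_depth is by far the most costly.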

Secchi Depth Model 1 Residual Diagnostics

Model 2: Transforming the Response Variable

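\(\sqrt{\text{Secchi Depth}} = \alpha + \beta_1 \cdot \text{Salinity} + \beta_2 \cdot \text{Dissolved Oxygen} + \beta_3 \cdot \text{pH} + \beta_4 \cdot \text{Water Depth} + \beta_5 \cdot \text{Water Temp} + \beta_6 \cdot \text{Air Temp}\)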

Variable          VIF
Salinity          1.273551
Dissolved Oxygen  1.653153
pH                1.177584
Water Depth       1.033281
Water Temp        2.196920
Air Temp          1.659053
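
A variance inflation factor is defined as VIF = 1 / (1 - R²), where R² comes from regressing that predictor on all the others, so each value in the table can be converted back into an implied R². A minimal sketch using the values above; the cutoff of 5 is a common rule of thumb and an assumption here, not something taken from the slides.

```r
# VIF values copied from the table above.
vif <- c(salinity = 1.273551, do = 1.653153, ph = 1.177584,
         water_depth = 1.033281, water_temp = 2.196920, air_temp = 1.659053)

# VIF_j = 1 / (1 - R_j^2)  =>  R_j^2 = 1 - 1 / VIF_j
r_squared <- 1 - 1 / vif
round(r_squared, 3)

# Rule-of-thumb check (the threshold 5 is an assumption, not from the source).
any_high <- any(vif > 5)
any_high
```

Water temperature has the largest VIF, which corresponds to roughly 54% of its variance being explained by the other predictors; none of the values comes close to the cutoff.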

AIC after dropping each variable

Variable dropped  AIC
pH                -5180.3
Air Temp          -5178.6
Water Temp        -5178.4
none              -5178.3
Dissolved Oxygen  -5170.1
Salinity          -5142.9
Water Depth       -3836.2
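  • Final Regression Model After Stepwise Selection: \(\sqrt{\text{Secchi Depth}} = \alpha + \beta_1 \cdot \text{Salinity} + \beta_2 \cdot \text{Dissolved Oxygen} + \beta_3 \cdot \text{Water Depth} + \beta_4 \cdot \text{Water Temp}\)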
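
The selection logic can be traced from the AIC table. The sketch below assumes the table lists the AIC obtained after dropping each single term from the full square-root model, with `none` meaning no term dropped; backward selection then removes the term whose removal gives the lowest AIC and recomputes the table. The object names are illustrative.

```r
# AIC values copied from the table above (assumed: AIC after dropping each term).
aic_after_drop <- c(ph = -5180.3, air_temp = -5178.6, water_temp = -5178.4,
                    none = -5178.3, do = -5170.1, salinity = -5142.9,
                    water_depth = -3836.2)

# Backward selection drops the term with the lowest resulting AIC,
# provided that it beats keeping every term ("none").
first_drop <- names(which.min(aic_after_drop))

# Terms whose removal would lower the AIC at this first step.
improves <- names(aic_after_drop)[aic_after_drop < aic_after_drop[["none"]]]

first_drop
improves

# With the data available, the whole procedure would look roughly like:
# full <- lm(sqrt(secchi_depth) ~ salinity + do + ph + water_depth +
#              water_temp + air_temp, data = water_nona)
# step(full, direction = "backward")
```

Only one term is removed per step and the AICs are then recomputed. That would explain why water temperature, which looks marginally removable in this first table, still appears in the final model once pH and air temperature are gone.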

Exploratory summaries(3)

Exploratory Charts

Exploratory summaries(3)

Water Temperature Variation over Dissolved Oxygen

Fitted Model Summary(3)

Model Residual Diagnostics

Model 1 equation: \(\text{Dissolved Oxygen} = \alpha + \beta_1 \cdot \text{Water Temperature}\)

Thank you